Everything about Context-free Grammar totally explained
In
formal language theory, a
context-free grammar (
CFG) is a
grammar in which every
production rule is of the form:
V →
w
where
V is a single
nonterminal symbol, and
w is a string of
terminals and/or nonterminals (possibly empty). The term "context-free" expresses the fact that nonterminals can be rewritten without regard to the context in which they occur. A
formal language is
context-free if some context-free grammar generates it.
Context-free grammars play a central role in the description and design of
programming languages and
compilers. They are also used for analyzing the
syntax of
natural languages.
Background
Since the time of
Pāṇini, at least, linguists have described the
grammars of languages in terms of their
block structure, and described how sentences are
recursively built up from smaller phrases, and eventually individual words or word elements.
The context-free grammar (or "phrase-structure grammar"
in the mid-
1950s, took the manner in which linguistics had described this grammatical structure, and then turned it into rigorous mathematics. A context-free grammar provides a simple and precise mechanism for describing the methods by which phrases in some natural language are built from smaller blocks, capturing the "block structure" of sentences in a natural way. Its simplicity makes the formalism amenable to rigorous mathematical study, but it comes at a price: important features of natural language syntax such as
agreement and
reference can't be expressed in a natural way, or not at all.
Block structure was introduced into computer
programming languages by the
Algol project, which, as a consequence, also featured a context-free grammar to describe the resulting Algol syntax. This became a standard feature of computer languages, and the notation for grammars used in concrete descriptions of computer languages came to be known as
Backus-Naur Form, after two members of the Algol language design committee.
The "block structure" aspect that context-free grammars capture is so fundamental to grammar that the terms syntax and grammar are often identified with context-free grammar rules, especially in computer science. Formal constraints not captured by the grammar are then considered to be part of the "semantics" of the language.
Context-free grammars are simple enough to allow the construction of efficient
parsing algorithms which, for a given string, determine whether and how it can be generated from the grammar. An
Earley parser is an example of such an algorithm, while the widely used
LR and
LL parsers are more efficient algorithms that deal only with more restrictive subsets of context-free grammars.
Formal definitions
A context-free grammar
G is a 4-
tuple:
where
1.
is a finite set of
non-terminal characters or variables. They represent different types of phrase or clause in the sentence.
2.
is a finite set of
terminals, disjoint with
, which make up the actual content of the sentence.
3.
is a relation from
to